Querying Versioned Software Repositories
Authors
Dietrich Christopeit, Michael Böhlen, Carl-Christian Kanne, Arturas Mazeika
Abstract
A large part of today’s data is stored in text documents that undergo a series of changes during their lifetime. For instance, during the development of a software product the source code changes frequently. Currently, managing such data relies on version control systems (VCSs). Extracting information from large documents and their different versions is a manual and tedious process. We present Qvestor, a system for declaratively querying documents. It leverages information about the structure of a document, available as a context-free grammar, and supports declarative queries over document versions through a grammar annotated with relational algebra expressions. We define and illustrate the annotation of grammars with relational algebra expressions and show how to translate the annotations into easy-to-use SQL views.
Originally published as: Christopeit, Dietrich; Böhlen, Michael H; Kanne, Carl-Christian; Mazeika, Arturas (2011). Querying versioned software repositories. In: 15th International Conference on Advances in Databases and Information Systems, Vienna, Austria, 20-23 September 2011, 42-55. DOI: https://doi.org/10.1007/978-3-642-23737-9_4
Accepted version posted at the Zurich Open Repository and Archive (ZORA), University of Zurich: https://doi.org/10.5167/uzh-56409
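To make the idea of querying document versions through SQL views more concrete, here is a minimal sketch in Python. It is not Qvestor's annotation language or implementation: a regular expression stands in for a single context-free grammar rule, and the table `func_def`, the view `func_lifetime`, and the three sample versions are hypothetical names and data chosen only for illustration.

```python
import re
import sqlite3

# Hypothetical toy input: three versions of one source file, keyed by version number.
VERSIONS = {
    1: "def load():\n    pass\n",
    2: "def load():\n    pass\n\ndef save():\n    pass\n",
    3: "def save():\n    pass\n\ndef export():\n    pass\n",
}

# Stand-in for a grammar rule (assumption): matches 'def NAME(' at the start of a line.
FUNC_RULE = re.compile(r"^def\s+(\w+)\s*\(", re.MULTILINE)


def build_database(versions):
    """Load one row per (version, function name) and define a view over the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE func_def (version INTEGER, name TEXT)")
    for version, text in versions.items():
        rows = [(version, m.group(1)) for m in FUNC_RULE.finditer(text)]
        conn.executemany("INSERT INTO func_def VALUES (?, ?)", rows)
    # View: for each function, the first and last version in which it appears.
    conn.execute(
        """
        CREATE VIEW func_lifetime AS
        SELECT name, MIN(version) AS first_version, MAX(version) AS last_version
        FROM func_def
        GROUP BY name
        """
    )
    return conn


def removed_functions(conn, latest_version):
    """Return names of functions that are absent from the latest version."""
    cur = conn.execute(
        "SELECT name FROM func_lifetime WHERE last_version < ? ORDER BY name",
        (latest_version,),
    )
    return [row[0] for row in cur.fetchall()]


conn = build_database(VERSIONS)
print(removed_functions(conn, max(VERSIONS)))
```

The point of the view is that a question spanning several versions ("which functions were removed?") becomes a one-line SQL query instead of a manual comparison of checkouts.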
Similar resources
Interacting with local and remote data repositories using the stashR package
The stashR package (a Set of Tools for Administering Shared Repositories) for R implements a basic versioned key-value style database where character string keys are associated with data values. Using the S4 classes ‘localDB’ and ‘remoteDB’, and associated methods, versioned key-value databases can be either created locally on the user’s computer or accessed remotely via the Internet. The stash...
A logic foundation for a general-purpose history querying tool
Version control systems (VCS) have become indispensable software development tools. The version snapshots they store to provide support for change coordination and release management, effectively track the evolution of the versioned software and its development process. Despite this wealth of historical information, it has only been leveraged by tools that are dedicated to a specific task such ...
Indexing Highly Repetitive Collections
The need to index and search huge highly repetitive sequence collections is rapidly arising in various fields, including computational biology, software repositories, versioned collections, and others. In this short survey we briefly describe the progress made along three research lines to address the problem: compressed suffix arrays, grammar compressed indexes, and Lempel-Ziv compressed indexes.
A Comparison of Top-k Temporal Keyword Querying over Versioned Text Collections
As the web evolves over time, the amount of versioned text collections increases rapidly. Most web search engines will answer a query by ranking all known documents at the (current) time the query is posed. There are applications however (for example customer behavior analysis, crime investigation, etc.) that would need to efficiently query these sources as of some past time, that is, retrieve ...
Integrating software engineering tools and repositories with XML and XSLT
Interoperability between heterogeneous repositories and applications is often needed in Internet-based software development. At present XML is increasingly being used to integrate repositories and to express data fetched from various sources, but mismatches are encountered between the schemas of different repositories. XSLT is typically used to stylize results, but this does not utilize the ful...